117 research outputs found

Considering Transposable Element Diversification in De Novo Annotation Approaches

Author: A Agrawal
A Coghlan
A Herpin
A Loytynoja
A Martin
A Santangelo
AL Price
AM Waterhouse
B McClintock
C Bergman
C Feschotte
C Feschotte
C Notredame
CA Cuomo
Catherine Feuillet
CMM Bergman
D Finnegan
E Eichler
E Mayr
E Paux
Elodie Duprat
F Teixeira
G Abrusan
G Benson
G Bourque
G Yang
H Quesneville
H Quesneville
H Quesneville
Hadi Quesneville
I Dondoshansky
J Jurka
J Jurka
J Newman
J Thompson
JFY Brookfield
K Katoh
K Rasmussen
L Orgel
L Zhou
M Brent
M Lynch
MD Adams
N Buisine
N Jiang
P Abad
PS Schnable
R Cordaux
R Kolpakov
R Slotkin
RC Edgar
S Altschul
S Kurtz
S Saha
S Schuster
S Tempel
S Wessler
T Wicker
Timothée Flutre
V Nene
W Gu
W Kent
X Huang
Y Gray
Ying Xu
Z Bao
Publication venue: Public Library of Science
Publication date: 01/01/2011

Transposable elements (TEs) are mobile, repetitive DNA sequences that are almost ubiquitous in prokaryotic and eukaryotic genomes. They have a large impact on genome structure, function and evolution. With the recent development of high-throughput sequencing methods, many genome sequences have become available, making possible comparative studies of TE dynamics at an unprecedented scale. Several methods have been proposed for the de novo identification of TEs in sequenced genomes. Most begin with the detection of genomic repeats, but the subsequent steps for defining TE families differ. High-quality TE annotations are available for the Drosophila melanogaster and Arabidopsis thaliana genome sequences, providing a solid basis for the benchmarking of such methods. We compared the performance of specific algorithms for the clustering of interspersed repeats and found that only a particular combination of algorithms detected TE families with good recovery of the reference sequences. We then applied a new procedure for reconciling the different clustering results and classifying TE sequences. The whole approach was implemented in a pipeline using the REPET package. Finally, we show that our combined approach highlights the dynamics of well defined TE families by making it possible to identify structural variations among their copies. This approach makes it possible to annotate TE families and to study their diversification in a single analysis, improving our understanding of TE dynamics at the whole-genome scale and for diverse species

Public Library of Science (PLOS)

Crossref

HAL Clermont Université

Directory of Open Access Journals

PubMed Central

ProdInra

Hal-Diderot

High-quality de novo assembly of the apple genome and methylome dynamics of early fruit development

Author: Aubourg S.
Becker C.
Bianco L.
Bucher E.
Celton J. M.
Choisne N.
Daccord N.
Di Pierro E. A.
Durel C. E.
Gaillard S.
Gouzy J.
Guérif P.
Jasper D.
Laurens F.
Lespinasse Y.
Linsmith Gareth
Micheletti D.
Muranty H.
Quesneville H.
Rees G.
Schijlen E.
Troggio M.
van de Geest H.
van de Weg E.
Velasco R.
Weigel D.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2017

Using the latest sequencing and optical mapping technologies, we have produced a high-quality de novo assembly of the apple (Malus domestica Borkh.) genome. Repeat sequences, which represented over half of the assembly, provided an unprecedented opportunity to investigate the uncharacterized regions of a tree genome; we identified a new hyper-repetitive retrotransposon sequence that was over-represented in heterochromatic regions and estimated that a major burst of different transposable elements (TEs) occurred 21 million years ago. Notably, the timing of this TE burst coincided with the uplift of the Tian Shan mountains, which is thought to be the center of the location where the apple originated, suggesting that TEs and associated processes may have contributed to the diversification of the apple ancestor and possibly to its divergence from pear. Finally, genome-wide DNA methylation data suggest that epigenetic marks may contribute to agronomically relevant aspects, such as apple fruit development

HAL-UNICE

Archivio istituzionale della ricerca - Fondazione Edmund Mach

Linkage disequilibrium in young genetically isolated Dutch population

Author: A Collins
A Kong
A Wright
B Devlin
B Muller-Myhsok
C Zapata
D Fallin
D Zaykin
DE Reich
DJ Schaid
DJ Schaid
E Sobel
ES Lander
GR Abecasis
H Quesneville
JC Venter
KM Weiss
L Kruglyak
M Abney
M Boehnke
MD Teare
N Risch
P Zavattari
RC Lewontin
SK Service
T Varilo
YS Aulchenko
Publication venue: Springer Science and Business Media LLC
Publication date: 01/07/2004

The design and feasibility of genetic studies of complex diseases are critically dependent on the extent and distribution of linkage disequilibrium (LD) across the genome and between different populations. We have examined genomewide and region-specific LD in a young genetically isolated population identified in the Netherlands by genotyping approximately 800 Short Tandem Repeat markers distributed genomewide across 58 individuals. Several regions were an

Crossref

Erasmus University Digital Repository

Correlation of LNCR rasiRNAs Expression with Heterochromatin Formation during Development of the Holocentric Insect Spodoptera frugiperda

Author: A Criniti
A Murakami
A Pelisson
A Verdel
AA Aravin
AC Chueh
B Czech
BY Lu
C Klattenhoff
CN Topp
D Fagegaltier
DM Carone
E d'Alençon
Emmanuelle d'Alençon
Emmanuelle Permal
ER Havecker
François Cousserans
G Jagadeeswaran
H Quesneville
Hadi Quesneville
HB Megosh
HH Kazazian Jr
HR Lee
J Brennecke
J Brennecke
J Brosius
JH Bergmann
K Saito
KA Senti
LH Wong
LS Gunawardane
M Gerbal
M Halic
M Mandrioli
M Mandrioli
M Wassenegger
MA Matzke
Michael Freitag
MS Klenov
N Rhind
Philippe Fournier
PY Chen
R Santoro
S Desset
S Houwing
S Kawaoka
S Shpiz
Slavica Stanojcic
Sylvie Gimenez
T Wicker
TA Farazi
TA Volpe
VV Vagin
Y Du
Y Kawamura
YJ Lu
Z Lippman
Publication venue: Public Library of Science
Publication date: 01/01/2011

Repeat-associated small interfering RNAs (rasiRNAs) are derived from various genomic repetitive elements and ensure genomic stability by silencing endogenous transposable elements. Here we describe a novel subset of 46 rasiRNAs named LNCR rasiRNAs due to their homology with one long non-coding RNA (LNCR) of Spodoptera frugiperda. LNCR operates as the intermediate of an unclassified transposable element (TE-LNCR). TE-LNCR is a very invasive transposable element, present in high copy numbers in the S. frugiperda genome. LNCR rasiRNAs are single-stranded RNAs without a prominent nucleotide motif, which are organized in two distinct, strand-specific clusters. The expression of LNCR and LNCR rasiRNAs is developmentally regulated. Formation of heterochromatin in the genomic region where three copies of the TE-LNCR are embedded was followed by chromatin immunoprecipitation (ChIP) and we observed this chromatin undergo dynamic changes during development. In summary, increased LNCR expression in certain developmental stages is followed by the appearance of a variety of LNCR rasiRNAs which appears to correlate with subsequent accumulation of a heterochromatic histone mark and silencing of the genomic region with TE-LNCR. These results support the notion that a repeat-associated small interfering RNA pathway is linked to heterochromatin formation and/or maintenance during development to establish repression of the TE-LNCR transposable element. This study provides insights into the rasiRNA silencing pathway and its role in the formation of fluctuating heterochromatin during the development of one holocentric organism

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

ProdInra

A high-quality sequence of Rosa chinensis to elucidate genome structure and ornamental traits

Author: A. Berard
A. Chastellier
C. Maliepaard
D. Lakhwani
D. Schulz
E. Bucher
E. Neu
E. Schijlen
F. Foucher
H. Quesneville
H. Van de Geest
I. Kirov
J. Clotault
J. De Riek
J. Jeauffre
K. Kawamura
K. Van Laere
L. Hamama
L. Hibrand-Saint Oyant
L. Leus
L. Voisine
M. Linde
M.C. Le Paslier
N. Choisne
N. Daccord
N.N. Zhou
P. Arens
P.M. Bourke
R. Bounon
R. Smulder
R. Voorrips
S. Aubourg
S. Balzergue
S. Gaillard
S. Sakr
T. Borm
T. Debener
T. Hesselink
T. Ruttink
T. Thouroude
Publication venue: Cold Spring Harbor Laboratory
Publication date: 01/01/2018

Rose is the world's most important ornamental plant with economic, cultural and symbolic value. Roses are cultivated worldwide and sold as garden roses, cut flowers and potted plants. Rose has a complex genome with high heterozygosity and various ploidy levels. Our objectives were (i) to develop the first high-quality reference genome sequence for the genus Rosa by sequencing a doubled haploid, combining long and short read sequencing, and anchoring to a high-density genetic map and (ii) to study the genome structure and the genetic basis of major ornamental traits. We produced a haploid rose line from R. chinensis "Old Blush" and generated the first rose genome sequence at the pseudo-molecule scale (512 Mbp with N50 of 3.4 Mb and L75 of 97). The sequence was validated using high-density diploid and tetraploid genetic maps. We delineated hallmark chromosomal features including the pericentromeric regions through annotation of TE families and positioned centromeric repeats using FISH. Genetic diversity was analysed by resequencing eight Rosa species. Combining genetic and genomic approaches, we identified potential genetic regulators of key ornamental traits, including prickle density and number of flower petals. A rose APETALA2 homologue is proposed to be the major regulator of petal number in rose. This reference sequence is an important resource for studying polyploidisation, meiosis and developmental processes as we demonstrated for flower and prickle development. This reference sequence will also accelerate breeding through the development of molecular markers linked to traits, the identification of the genes underlying them and the exploitation of synteny across Rosaceae.
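
The N50 and L75 values quoted in this abstract are standard assembly contiguity statistics. As a rough illustration of how such figures are usually derived (this is not the authors' pipeline; the function name `nx_lx` and the toy sequence lengths are hypothetical), a minimal Python sketch:

```python
def nx_lx(lengths, fraction):
    """Return (Nx, Lx) for a list of sequence lengths.

    Nx is the length of the shortest sequence in the smallest set of
    longest sequences that together cover `fraction` of the assembly;
    Lx is the number of sequences in that set.
    Hypothetical helper for illustration only.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)   # longest sequences first
    target = fraction * sum(ordered)          # bases that must be covered
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= target:
            return length, count
    # Only reachable if fraction > 1
    raise ValueError("fraction must be between 0 and 1")


# Toy assembly of seven sequences (arbitrary units), 300 in total
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = nx_lx(toy_lengths, 0.50)
n75, l75 = nx_lx(toy_lengths, 0.75)
```

Under this common definition, an "L75 of 97" would mean that the 97 longest sequences together cover at least 75% of the assembled bases; the exact convention used in the paper is not stated in the abstract.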

Novel transposable elements from Anopheles gambiae

Background: Transposable elements (TEs) are DNA sequences, present in the genomes of most eukaryotic organisms, whose key characteristic is the ability to mobilize and increase their copy number within chromosomes. These elements are important for eukaryotic genome structure and evolution and have lately been considered as potential drivers for introducing transgenes into pathogen-transmitting insects as a means to control vector-borne diseases. The aim of this work was to catalog the diversity and abundance of TEs within the Anopheles gambiae genome using the PILER tool and to consolidate a database in the form of a hyperlinked spreadsheet containing detailed and readily available information about the TEs present in the genome of An. gambiae. Results: Here we present the spreadsheet named AnoTExcel, which constitutes a database with detailed information on most of the repetitive elements present in the genome of the mosquito. Despite previous work on this topic, our approach permitted the identification and characterization of both previously described and novel TEs that are further described in detail. Conclusions: Identification and characterization of TEs in a given genome is important as a way to understand the diversity and evolution of the whole set of TEs present in a given species. This work contributes to a better understanding of the landscape of TEs present in the mosquito genome. It also presents a novel platform for the identification, analysis, and characterization of TEs in sequenced genomes.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Context-driven discovery of gene cassettes in mobile integrons using a computational grammar

Author: A Moura
ACE Darling
AL Delcher
AL Delcher
CJ van Rijsbergen
D Frishman
DA Rowe-Magnus
DB Searls
E Rivas
Enrico Coiera
F Baquero
F Meyer
F Meyer
Guy Tsafnat
H Quesneville
HW Stokes
HW Stokes
IT Paulsen
J Fleiss
J Landis
Jaron Schaeffer
Jon R Iredell
K Rutherford
L Stein
M Ashburner
M Kanehisa
MA Andrade
MJ Joss
R Overbeek
RM Hall
RS Levings
S Ji
S Leung
Sally R Partridge
SF Altschul
SR Partridge
U Bohnebeck
WR Pearson
Y Boucher
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Gene discovery algorithms typically examine sequence data for low level patterns. A novel method to computationally discover higher order DNA structures is presented, using a context sensitive grammar. The algorithm was applied to the discovery of gene cassettes associated with integrons. The discovery and annotation of antibiotic resistance genes in such cassettes is essential for effective monitoring of antibiotic resistance patterns and formulation of public health antibiotic prescription policies. Results: We discovered two new putative gene cassettes using the method, from 276 integron features and 978 GenBank sequences. The system achieved κ = 0.972 annotation agreement with an expert gold standard of 300 sequences. In rediscovery experiments, we deleted 789,196 cassette instances over 2030 experiments and correctly relabelled 85.6% (α ≥ 95%, E ≤ 1%, mean sensitivity = 0.86, specificity = 1, F-score = 0.93), with no false positives. Error analysis demonstrated that for 72,338 missed deletions, two adjacent deleted cassettes were labeled as a single cassette, increasing performance to 94.8% (mean sensitivity = 0.92, specificity = 1, F-score = 0.96). Conclusion: Using grammars we were able to represent heuristic background knowledge about large and complex structures in DNA. Importantly, we were also able to use the context embedded in the model to discover new putative antibiotic resistance gene cassettes. The method is complementary to existing automatic annotation systems which operate at the sequence level.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Macquarie University ResearchOnline

Repetitive Elements May Comprise Over Two-Thirds of the Human Genome

Author: A Nekrutenko
A. P. Jason de Koning
AFA Smit
AL Price
AR Quinlan
C Feschotte
DA Ray
David D. Pollock
E Lerat
EE Eichler
EF Kirkness
G Achaz
G Benson
G Lunter
Gregory P. Copenhaver
H Quesneville
HH Kazazian Jr
J Brosius
J Jurka
J Jurka
J Jurka
JS Mattick
JU Pontius
K Lindblad-Toh
M Pheasant
MA Batzer
Mark A. Batzer
MC Frith
R Li
RC Edgar
RM Kuhn
S Karlin
S Kurtz
SF Altschul
TA Castoe
Todd A. Castoe
TS Mikkelsen
W Gu
Wanjun Gu
WC Warren
Z Bao
Publication venue: Public Library of Science
Publication date: 01/12/2011

Transposable elements (TEs) are conventionally identified in eukaryotic genomes by alignment to consensus element sequences. Using this approach, about half of the human genome has been previously identified as TEs and low-complexity repeats. We recently developed a highly sensitive alternative de novo strategy, P-clouds, that instead searches for clusters of high-abundance oligonucleotides that are related in sequence space (oligo “clouds”). We show here that P-clouds predicts >840 Mbp of additional repetitive sequences in the human genome, thus suggesting that 66%–69% of the human genome is repetitive or repeat-derived. To investigate this remarkable difference, we conducted detailed analyses of the ability of both P-clouds and a commonly used conventional approach, RepeatMasker (RM), to detect different sized fragments of the highly abundant human Alu and MIR SINEs. RM can have surprisingly low sensitivity for even moderately long fragments, in contrast to P-clouds, which has good sensitivity down to small fragment sizes (∼25 bp). Although short fragments have a high intrinsic probability of being false positives, we performed a probabilistic annotation that reflects this fact. We further developed “element-specific” P-clouds (ESPs) to identify novel Alu and MIR SINE elements, and using it we identified ∼100 Mb of previously unannotated human elements. ESP estimates of new MIR sequences are in good agreement with RM-based predictions of the amount that RM missed. These results highlight the need for combined, probabilistic genome annotation approaches and suggest that the human genome consists of substantially more repetitive sequence than previously believed

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Louisiana State University

Meeting the Challenges Facing Wheat Production: The Strategic Research Agenda of the Global Wheat Initiative

Wheat occupies a special role in global food security since, in addition to providing 20% of our carbohydrates and protein, almost 25% of the global production is traded internationally. The importance of wheat for food security was recognised by the Chief Agricultural Scientists of the G20 group of countries when they endorsed the establishment of the Wheat Initiative in 2011. The Wheat Initiative was tasked with supporting the wheat research community by facilitating collaboration, information and resource sharing and helping to build the capacity to address challenges facing production in an increasingly variable environment. Many countries invest in wheat research. Innovations in wheat breeding and agronomy have delivered enormous gains over the past few decades, with the average global yield increasing from just over 1 tonne per hectare in the early 1960s to around 3.5 tonnes in the past decade. These gains are threatened by climate change and the rapidly rising financial and environmental costs of fertilizer and pesticides, combined with declines in water availability for irrigation in many regions. The international wheat research community has worked to identify major opportunities to help ensure that global wheat production can meet demand. The outcomes of these discussions are presented in this paper.

Rothamsted Repository

Sequencing of Pooled DNA Samples (Pool-Seq) Uncovers Complex Dynamics of Transposable Element Insertions in Drosophila melanogaster

Author: A Burt
AA Aravin
AFA Smit
Andrea J. Betancourt
AS Fiston-Lavier
AS Goldman
B Charlesworth
B Charlesworth
B Charlesworth
B Charlesworth
C Alkan
C Bartolome
C Bartolome
C Bartolome
C Biemont
C Biemont
C Biemont
C Hoogland
C Rizzon
CD Malone
CE Perez-Gonzalez
CH Langley
Christian Schlötterer
CM Bergman
CM Bergman
D Houle
DA Petrov
DA Petrov
DA Petrov
DA Petrov
David J. Begun
DJ Finnegan
E Lerat
E Montgomery
EA Montgomery
EA Montgomery
ES Dolgin
F Hormozdiari
F Tajima
GF Berriz
GJ Hannon
GM Rubin
H Biessmann
H Biessmann
H Innan
H Li
H Li
H Li
H Quesneville
IK Jordan
J Brennecke
J Cohen
J Gonzalez
J Gonzalez
J Hermisson
JA Anderson
JM Braverman
JM Smith
JP Blumenstiel
JS Kaminker
JT Robinson
KM Hazzouri
L Cooley
LN van de Lagemaat
M Lipatov
M Papaceit
M Puig
M Steinemann
MA Jensen
MD Adams
MG Kidwell
MG Kidwell
ND Singh
NJ Bowen
PD Sniegowski
PJ Daborn
R Bergero
R Drysdale
R Kofler
Robert Kofler
RV Pandey
SV Nuzhdin
T Wicker
TB Sackton
VV Kapitonov
W Wang
WA Odgers
WF Eanes
WG Hill
WJ Miller
X Maside
X Maside
Y Kim
YC Lee
YT Aminetzach
Publication venue: Public Library of Science
Publication date: 01/01/2012

Transposable elements (TEs) are mobile genetic elements that parasitize genomes by semi-autonomously increasing their own copy number within the host genome. While TEs are important for genome evolution, appropriate methods for performing unbiased genome-wide surveys of TE variation in natural populations have been lacking. Here, we describe a novel and cost-effective approach for estimating population frequencies of TE insertions using paired-end Illumina reads from a pooled population sample. Importantly, the method treats insertions present in and absent from the reference genome identically, allowing unbiased TE population frequency estimates. We apply this method to data from a natural Drosophila melanogaster population from Portugal. Consistent with previous reports, we show that low recombining genomic regions harbor more TE insertions and maintain insertions at higher frequencies than do high recombining regions. We conservatively estimate that there are almost twice as many “novel” TE insertion sites as sites known from the reference sequence in our population sample (6,824 novel versus 3,639 reference sites, with on average a 31-fold coverage per insertion site). Different families of transposable elements show large differences in their insertion densities and population frequencies. Our analyses suggest that the history of TE activity significantly contributes to this pattern, with recently active families segregating at lower frequencies than those active in the more distant past. Finally, using our high-resolution TE abundance measurements, we identified 13 candidate positively selected TE insertions based on their high population frequencies and on low Tajima's D values in their neighborhoods

University of Liverpool Repository

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

FigShare